EVALUATION

The necessity for more complex software systems in such areas as
command and control and avionics has led to the desire for better
methods for predicting software errors to ensure that software
produced is of higher quality and of lower cost. This desire has been
expressed in numerous industry and Government sponsored conferences,
as well as in documents such as the Joint Commanders' Software
Reliability Working Group Report (Nov 1975). As a result, numerous
efforts have been initiated to develop and validate mathematical
models for predicting such quantities as the number of remaining
errors in a software package, the time to achieve a desired
reliability level, and a measure of the software reliability. However,
early efforts have not produced models with the desired accuracy of
prediction and with the necessary confidence limits for general model
usage.


This effort was initiated in response to this need for developing
better and more accurate software error prediction models and fits
into the goals of RADC TPO No. 5, Software Cost Reduction (formerly
RADC TPO No. 11, Software Sciences Technology), in the subthrust of
Software Quality (Software Modeling). This report summarizes the
development of a mathematical model for predicting quantities such as
the expected number of remaining errors, achieved reliability, and
time to detect and correct a specified number of errors. The model
assumes that a software error is not corrected at a given time with
probability 1 (i.e., imperfect debugging). The importance of this
development is that it represents the first attempt to develop
software error prediction models that incorporate imperfect debugging,
and thus more closely reflect the actual software error detection and
correction process.

The theory and equations developed under this effort will lead to much
needed predictive measures for use by software managers in more
accurately tracking software development projects in terms of test
time needed to achieve given reliability and error objectives. In
addition, the associated confidence limits and other related
statistical quantities developed under this effort will ensure more
widespread use of these modeling techniques. Finally, the predictive
measures and equations developed under this effort will be applicable
to current Air Force software development projects and thus help to
produce the high quality, low cost software needed for today's
systems.

2. MODEL DEVELOPMENT

The  following  assumptions  are  made  for  developing  the  model. 

(i) The error causing a software failure, when detected, is
corrected with probability p ($0 \le p \le 1$), while with probability
q (p + q = 1) we fail to completely remove it. Thus, q is the
probability of imperfect debugging.

(ii) Errors in the software package are independent of each other
and have a constant occurrence rate $\lambda$.

(iii)  The  probability  of  two  or  more  errors  occurring  simultaneously 
is  negligible. 

(iv)  The  time  to  remove  an  error  is  considered  to  be  negligible 
in  this  model. 

(v)  No  new  errors  are  introduced  during  the  debugging  process. 

(vi)  At  most  one  error  is  removed  at  correction  time. 

Let X(t) denote the number of errors remaining in the package
at time t. We will use this random variable to describe the state
of the error process at time t. Further, let N be the number of
errors at the beginning of the debugging phase, i.e., X(0) = N.

Suppose that there are i errors in the package at some time.
Then from assumption (i), we note that after the occurrence of the next
failure

$$ X(t) = \begin{cases} i-1 & \text{with probability } p \\ i & \text{with probability } q \end{cases} \tag{2.1} $$

In other words, if we were to observe the X(t) process at times of
software failures, then its behavior is governed by equation (2.1).
The transition probabilities $P_{ij}$ from state i to state j are
therefore

$$ P_{ij} = \begin{cases} 1 & i = j = 0 \\ p & j = i-1 \\ q & j = i \\ 0 & \text{otherwise.} \end{cases} \tag{2.2} $$

A diagrammatic representation of transitions between states
corresponding to equation (2.2) is given in Figure 2.1.

Now, assumptions (i) and (ii) imply that the times between
successive software failures (error occurrences) follow an
exponential distribution. Suppose at some time $t = \tau$, $X(\tau) = i$,
i = 0, 1, ..., N. Then the probability density function (pdf) $f_i(t)$
of the time to next failure, $T_i$, is given by the distribution of
the first order statistic of i exponential distributions each with
parameter $\lambda$, i.e.,

$$ f_i(t) = i\lambda e^{-i\lambda t} \tag{2.3} $$

and the cumulative distribution function (cdf) is given by

$$ F_i(t) = 1 - e^{-i\lambda t}. \tag{2.4} $$


We note that even though the stochastic process X(t) makes
transitions from state to state in accordance with equation (2.2),
the times spent in various states are random and are given by
equation (2.3). Hence $\{X(t),\ t \ge 0\}$ forms a semi-Markov process.

A typical realization of this process is shown in Figure 2.2. It
should be pointed out that in our formulation the process X(t)
undergoes both real and virtual transitions. This means that after
an attempt to remove an error the state of X(t) may change or may
remain unchanged. In Figure 2.2 real transitions occur at states
N, N-2 and i while a virtual transition occurs at state N-1.
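The real and virtual transitions described above can be simulated directly. The following Python sketch is illustrative only: the function name `simulate_remaining`, the seed, the number of runs, and the parameter values (N = 10, p = 0.9, rate 0.02, horizon t = 50) are assumptions chosen for the example and do not come from the report.

```python
import math
import random

def simulate_remaining(n_errors, p, lam, t_end, rng):
    """Simulate one realization and return X(t_end), the errors remaining."""
    remaining, clock = n_errors, 0.0
    while remaining > 0:
        # With `remaining` errors, the time to next failure is exponential
        # with rate remaining * lam (first order statistic of exponentials).
        clock += rng.expovariate(remaining * lam)
        if clock > t_end:
            break
        # Real transition (error removed) with probability p,
        # virtual transition (state unchanged) with probability q = 1 - p.
        if rng.random() < p:
            remaining -= 1
    return remaining

rng = random.Random(12345)          # fixed seed, hypothetical choice
runs = 20000
mean_remaining = sum(
    simulate_remaining(10, 0.9, 0.02, 50.0, rng) for _ in range(runs)
) / runs
print(round(mean_remaining, 2))
```

Averaging many realizations gives a Monte Carlo estimate of the expected number of errors remaining at the chosen time, which can be compared with analytical results for the same parameters.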

Let $Q_{ij}(t)$ denote the one step transition probability that
after making a transition into state i, the process X(t) next
makes a transition into state j by time t. In other words, if a
software package has i remaining errors at time zero, then $Q_{ij}(t)$
represents the probability that the next failure, resulting in j
remaining errors, will be by time t. Hence, for i, j = 0, 1, 2, ..., N,
we can write

$$ Q_{ij}(t) = \int_0^t P\{X(u) = j,\ T_i = u \mid X(0) = i\}\, du . $$

Since the events $\{X(u) = j\}$ and $\{T_i = u\}$ are independent, we get

$$ Q_{ij}(t) = \int_0^t P\{X(u) = j \mid X(0) = i\}\, P\{T_i = u \mid X(0) = i\}\, du $$
$$ \phantom{Q_{ij}(t)} = \int_0^t P_{ij}\, P\{T_i = u \mid X(0) = i\}\, du $$
$$ \phantom{Q_{ij}(t)} = P_{ij} \int_0^t i\lambda e^{-i\lambda u}\, du $$

or

$$ Q_{ij}(t) = P_{ij}\,(1 - e^{-i\lambda t}) \tag{2.5} $$

for i, j = 0, 1, 2, ..., N.


It is obvious that $Q_{ij}(t)$ must satisfy

$$ Q_{ij}(t) \ge 0 , \quad i, j = 0, 1, 2, \ldots, N , \quad t > 0 $$

and

$$ \sum_{j=0}^{N} Q_{ij}(\infty) = p + q = 1 , \quad i = 0, 1, \ldots, N . $$

The probabilities $Q_{ij}(t)$ are obtained by multiplying the
probabilities $P_{ij}$ from (2.2) and $F_i(t)$ from (2.4). Thus, for
example,

$$ Q_{N,N-1}(t) = P_{N,N-1}\, F_N(t) $$

or

$$ Q_{N,N-1}(t) = p\,(1 - e^{-N\lambda t}) . $$

Proceeding similarly for all i, j we get $\{Q_{ij}(t)\}$ as shown in
Equation (2.6).

For known parameters N, p and $\lambda$, the probabilities $Q_{ij}(t)$
are obtained from Equation (2.6). This equation represents the
basic model that will be used in the following sections for obtaining
the various quantities of interest for the software error
phenomenon.
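The product form in (2.5) is short enough to evaluate directly. The sketch below is a minimal illustration; the function names `transition_prob` and `q_ij` are hypothetical, and the transition probabilities are taken from (2.1): p for a real transition, q = 1 - p for a virtual one, and zero otherwise.

```python
import math

def transition_prob(i, j, p):
    """P_ij of the embedded chain for states i >= 1, read off equation (2.1)."""
    if i >= 1 and j == i - 1:
        return p            # error removed
    if i >= 1 and j == i:
        return 1.0 - p      # imperfect debugging, state unchanged
    return 0.0

def q_ij(i, j, t, p, lam):
    """Q_ij(t) = P_ij * F_i(t) with F_i(t) = 1 - exp(-i * lam * t), as in (2.5)."""
    return transition_prob(i, j, p) * (1.0 - math.exp(-i * lam * t))

# Example: probability that the next failure occurs by t = 50 and leaves
# 9 errors, starting from 10 errors with p = 0.9 and rate 0.02.
print(q_ij(10, 9, 50.0, 0.9, 0.02))
```

For a fixed starting state with at least one error, the values over all j approach p + q = 1 as t grows, which is the normalization condition stated earlier.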


3. DERIVATION OF VARIOUS QUANTITIES OF INTEREST


3.1 Distribution of Time to a Completely Debugged Software System

Suppose i is the number of errors remaining in a software
system at some time during the debugging process. Let $g_{i,0}(t)$
and $G_{i,0}(t)$ denote the pdf and cdf, respectively, of the first
passage time from i to 0. In other words, $g_{i,0}(t)$ and $G_{i,0}(t)$
represent, respectively, the pdf and cdf of the time required to
obtain a completely debugged software system when the initial number
of errors is i.

Recall that at time zero, X(0) = N and at the time of the next
failure

$$ X(t) = \begin{cases} N-1 & \text{with probability } p \\ N & \text{with probability } q \end{cases} \tag{3.1} $$

as shown in Figure 2.1. Now, from the definition of $Q_{i,j}(t)$,
the probability of going from N to N-1 errors in time [u, u+du]
is $dQ_{N,N-1}(u)$. Then the process X(t) restarts with (N-1)
remaining errors at time u and the cdf of the first passage time is
$G_{N-1,0}(t-u)$. For the case of perfect debugging the cdf of the first
passage time is

$$ \int_0^t G_{N-1,0}(t-u)\, dQ_{N,N-1}(u) = Q_{N,N-1} * G_{N-1,0}(t) , \tag{3.2} $$

where * denotes convolution.

Similarly, if the debugging at the first error occurrence is
imperfect, the cdf of the first passage time is

$$ \int_0^t G_{N,0}(t-u)\, dQ_{N,N}(u) = Q_{N,N} * G_{N,0}(t) . \tag{3.3} $$

Since the events depicted in Equations (3.2) and (3.3) are
mutually exclusive, we get the renewal equation

$$ G_{N,0}(t) = Q_{N,N-1} * G_{N-1,0}(t) + Q_{N,N} * G_{N,0}(t) . \tag{3.4} $$

In general, we get the renewal equation

$$ G_{i,0}(t) = Q_{i,i-1} * G_{i-1,0}(t) + Q_{i,i} * G_{i,0}(t) \tag{3.5} $$

for i = 1, 2, ..., N, where $G_{0,0}(t) = 1$.

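We use Laplace-Stieltjes (L-S) transforms to solve renewal
equations (3.5), where the L-S transform of $G_{i,0}(t)$ is defined as

$$ \tilde{G}_{i,0}(s) = \int_0^\infty e^{-st}\, dG_{i,0}(t) . \tag{3.6} $$

From (3.5) we get

$$ \tilde{G}_{i,0}(s) = \tilde{Q}_{i,i-1}(s)\, \tilde{G}_{i-1,0}(s) + \tilde{Q}_{i,i}(s)\, \tilde{G}_{i,0}(s) , \quad i = 1, 2, \ldots, N \tag{3.7} $$

where

$$ \tilde{Q}_{i,i-1}(s) = \frac{p\, i\lambda}{s + i\lambda} , \tag{3.8} $$

$$ \tilde{Q}_{i,i}(s) = \frac{q\, i\lambda}{s + i\lambda} . \tag{3.9} $$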
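The recursive solution of (3.7) rests on one algebraic step, written out here as a worked derivation. It assumes the transforms $\tilde{Q}_{i,i-1}(s) = p\,i\lambda/(s+i\lambda)$ and $\tilde{Q}_{i,i}(s) = q\,i\lambda/(s+i\lambda)$, which follow from (2.5), together with p + q = 1 and $G_{0,0}(t) = 1$.

```latex
\begin{align*}
\tilde{G}_{i,0}(s)\,\bigl[1-\tilde{Q}_{i,i}(s)\bigr]
  &= \tilde{Q}_{i,i-1}(s)\,\tilde{G}_{i-1,0}(s) \\
\tilde{G}_{i,0}(s)
  &= \frac{p\,i\lambda/(s+i\lambda)}{1-q\,i\lambda/(s+i\lambda)}\,
     \tilde{G}_{i-1,0}(s)
   = \frac{i p\lambda}{s+i p\lambda}\,\tilde{G}_{i-1,0}(s) \\
\tilde{G}_{N,0}(s)
  &= \prod_{j=1}^{N}\frac{j p\lambda}{s+j p\lambda},
  \qquad \tilde{G}_{0,0}(s)=1 .
\end{align*}
```

Each factor is the transform of an exponential distribution with rate $jp\lambda$, so the first passage time behaves like a sum of N independent exponential stages.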


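Solving (3.7) recursively, we get the L-S transform of $G_{N,0}(t)$:

$$ \tilde{G}_{N,0}(s) = \prod_{j=1}^{N} \frac{jp\lambda}{s + jp\lambda} = \sum_{j=1}^{N} C_{N,j}\, \frac{jp\lambda}{s + jp\lambda} , \tag{3.10} $$

where

$$ C_{N,j} = \binom{N}{j} (-1)^{j-1} . \tag{3.11} $$

By taking the inverse of $\tilde{G}_{N,0}(s)$, the cdf of the first passage
time from N to 0 is:

$$ G_{N,0}(t) = \sum_{j=1}^{N} C_{N,j}\, \bigl(1 - e^{-jp\lambda t}\bigr) . \tag{3.12} $$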


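The pdf of the first passage time from N to 0 is given by

$$ g_{N,0}(t) = \sum_{j=1}^{N} C_{N,j}\, jp\lambda\, e^{-jp\lambda t} . \tag{3.13} $$

To illustrate the above result let us consider a software
system with N = 10, $\lambda$ = 0.02 and p = 0.8. Then

$$ G_{10,0}(t) = \sum_{j=1}^{10} \binom{10}{j} (-1)^{j-1} \bigl(1 - e^{-0.016\, j\, t}\bigr) . $$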

The  values  of  this  function  for  various  t are  plotted  in 
Figure  3.1.  From  this  plot  we  note  that  the  probability  of  getting 
an  error  free  system  by  275  time  units  is  0.9  and  by  500  units 
is  1.0.  Such  a plot  is  useful  for  calculating  the  time  required 
to  get  an  error  free  system  with  a desired  probability. 

Similar  plots  for  values  of  p=.85,  .90,  .95  and  1.0  are  also 
shown  in  Figure  3.1.  As  would  be  expected  the  cdf  for  a larger  p 
dominates  that  for  a smaller  p . In  other  words  the  better  the 
debugger,  the  faster  is  the  process  of  debugging. 
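The cdf of the time to a completely debugged system can be evaluated numerically for the example above. This is a minimal sketch: the function names are hypothetical, and the coefficient is taken to be the binomial coefficient with alternating sign, which is how (3.11) is read here. As a cross-check, the sketch also evaluates the product form (1 - exp(-p*lam*t))**N, an observation added for this example (the alternating sum is its binomial expansion), not a formula quoted from the report.

```python
import math

def c_coeff(n, j):
    """Coefficient C_{N,j}, read as binom(N, j) * (-1)**(j - 1)."""
    return math.comb(n, j) * (-1) ** (j - 1)

def cdf_fully_debugged(n, p, lam, t):
    """G_{N,0}(t) as the alternating sum of exponential cdfs, as in (3.12)."""
    return sum(
        c_coeff(n, j) * (1.0 - math.exp(-j * p * lam * t))
        for j in range(1, n + 1)
    )

def cdf_product_form(n, p, lam, t):
    """Equivalent closed form used here only as a numerical cross-check."""
    return (1.0 - math.exp(-p * lam * t)) ** n

for t in (100.0, 275.0, 500.0):
    print(t, cdf_fully_debugged(10, 0.8, 0.02, t), cdf_product_form(10, 0.8, 0.02, t))
```

The alternating sum loses precision when N is large because the binomial coefficients grow quickly, so the product form is the safer choice for numerical work under these assumptions.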
3.2 Distribution of Time to a Specified Number of Remaining Errors

In many instances a completely debugged software system is not cost
effective and we may be willing to tolerate a certain number of
remaining errors, say $n_0$, which will ensure some desired reliability.
The distribution of time to $n_0$ is then of interest.

Using an approach similar to that of Section 3.1 we get the
renewal equation

$$ G_{i,n_0}(t) = Q_{i,i-1} * G_{i-1,n_0}(t) + Q_{i,i} * G_{i,n_0}(t) \tag{3.14} $$

for $i = n_0+1, \ldots, N$, where $G_{n_0,n_0}(t) = 1$.

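Then the L-S transform of $G_{N,n_0}(t)$ is given by

$$ \tilde{G}_{N,n_0}(s) = \prod_{j=n_0+1}^{N} \frac{jp\lambda}{s + jp\lambda} = \sum_{j=1}^{N-n_0} B_{N,j,n_0}\, \frac{(n_0+j)p\lambda}{s + (n_0+j)p\lambda} , \tag{3.15} $$

where

$$ B_{N,j,n_0} = \frac{N!}{n_0!\, j!\, (N-n_0-j)!}\, (-1)^{j-1}\, \frac{j}{n_0+j} . \tag{3.16} $$

The cdf is obtained by taking the inverse L-S transform of
$\tilde{G}_{N,n_0}(s)$:

$$ G_{N,n_0}(t) = \sum_{j=1}^{N-n_0} B_{N,j,n_0} \bigl\{ 1 - e^{-(n_0+j)p\lambda t} \bigr\} , \tag{3.17} $$

and the pdf is

$$ g_{N,n_0}(t) = \sum_{j=1}^{N-n_0} B_{N,j,n_0}\, (n_0+j)p\lambda\, e^{-(n_0+j)p\lambda t} . \tag{3.18} $$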
To see the nature of the pdf and cdf, let us consider the
case when N = 10, $\lambda$ = 0.02 and p = 0.9. These are shown in Figures
3.2 and 3.3, respectively, for various values of $n_0$. The plots are
self-explanatory.
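The coefficients in (3.16) and the cdf in (3.17) can be computed as follows. This is a sketch under stated assumptions: the function names `b_coeff` and `cdf_to_n0` are hypothetical, and the coefficient is taken to be N!/(n0! j! (N-n0-j)!) times (-1)**(j-1) times j/(n0+j). Exact rational arithmetic is used for the coefficients so that their alternating signs do not introduce rounding error before the exponentials are applied.

```python
import math
from fractions import Fraction

def b_coeff(n, j, n0):
    """Coefficient B_{N,j,n0} as an exact fraction (reading of (3.16))."""
    multinomial = math.factorial(n) // (
        math.factorial(n0) * math.factorial(j) * math.factorial(n - n0 - j)
    )
    return Fraction((-1) ** (j - 1) * multinomial * j, n0 + j)

def cdf_to_n0(n, n0, p, lam, t):
    """G_{N,n0}(t): cdf of the first passage time from N to n0 errors."""
    return sum(
        float(b_coeff(n, j, n0)) * (1.0 - math.exp(-(n0 + j) * p * lam * t))
        for j in range(1, n - n0 + 1)
    )

# Parameters of the example: N = 10, rate 0.02, p = 0.9.
for n0 in (0, 1, 3, 5):
    print(n0, round(cdf_to_n0(10, n0, 0.9, 0.02, 100.0), 4))
```

The coefficients sum to one for any valid n0, which is what makes the expression a proper cdf as t grows without bound.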

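Now let a random variable $T_{N,n_0}$ denote the first passage time
from N to $n_0$ errors. Then, we obtain the k-th moment of $T_{N,n_0}$ as

$$ E\bigl[T_{N,n_0}^{k}\bigr] = \sum_{j=1}^{N-n_0} B_{N,j,n_0}\, \frac{k!}{\{(n_0+j)p\lambda\}^{k}} . \tag{3.19} $$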


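From (3.19), the mean and variance are

$$ E\bigl[T_{N,n_0}\bigr] = \frac{1}{p} \sum_{j=1}^{N-n_0} \frac{1}{(n_0+j)\lambda} , \tag{3.20} $$

$$ \mathrm{Var}\bigl(T_{N,n_0}\bigr) = E\bigl[T_{N,n_0}^{2}\bigr] - \bigl(E\bigl[T_{N,n_0}\bigr]\bigr)^{2} = \frac{1}{p^{2}} \sum_{j=1}^{N-n_0} \frac{1}{\{(n_0+j)\lambda\}^{2}} . \tag{3.21} $$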

The  values  for  mean  and  variance  of  first  passage  time  for  various 


Figure 3.3 CDF of Time to a Specified Number ($n_0$) of Remaining Errors (horizontal axis: TIME)

3.3 Distribution of Number of Remaining Errors


First,  we  develop  the  expressions  for  the  distribution  of  the 
number  of  remaining  errors  after  a specified  time  period,  t . Then, 
the  expected  number  of  remaining  errors  at  time  t is  obtained. 

Let $P_{N,n_0}(t)$ represent the probability that there are $n_0$
errors remaining in a software package at time t, given that there
are N errors at the beginning of debugging, i.e.,

$$ P_{N,n_0}(t) = P\{X(t) = n_0 \mid X(0) = N\} , \tag{3.22} $$

which is the so-called state occupancy probability. Conditioning
on the next failure and following an approach similar to that of
Section 3.1, we get the following renewal equation:

$$ P_{n_0,n_0}(t) = e^{-n_0\lambda t} + Q_{n_0,n_0} * P_{n_0,n_0}(t) , \quad n_0 \le N . \tag{3.23} $$

Conditioning on the first passage time, we get

$$ P_{N,n_0}(t) = G_{N,n_0} * P_{n_0,n_0}(t) , \quad n_0 < N . \tag{3.24} $$


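By taking the L-S transform of $P_{n_0,n_0}(t)$ and rearranging, we
get

$$ \tilde{P}_{n_0,n_0}(s) = \frac{s}{s + n_0 p\lambda} = 1 - \frac{n_0 p\lambda}{s + n_0 p\lambda} . \tag{3.25} $$

Substituting (3.25) into the L-S transform of $P_{N,n_0}(t)$, we obtain

$$ \tilde{P}_{N,n_0}(s) = \tilde{G}_{N,n_0}(s) - \frac{n_0 p\lambda}{s + n_0 p\lambda}\, \tilde{G}_{N,n_0}(s) = \tilde{G}_{N,n_0}(s) - \tilde{G}_{N,n_0-1}(s) . \tag{3.26} $$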
By taking the inverse L-S transform of $\tilde{P}_{N,n_0}(s)$ we get

$$ P_{N,n_0}(t) = G_{N,n_0}(t) - G_{N,n_0-1}(t) , \quad n_0 = 0, 1, 2, \ldots, N , \tag{3.27} $$

where $G_{N,N}(t) = 1$ and $G_{N,-1}(t) = 0$.


Figure 3.4 shows $P_{N,n_0}(t)$ for various $n_0$, where N = 10, p = 0.9,
and $\lambda$ = 0.02. From this figure we can see how the distribution of
the number of remaining errors changes with time.

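Now, we obtain the expected number of remaining errors in the
software at time t as follows:

$$ E[X(t) \mid X(0) = N] = \sum_{n_0=0}^{N} n_0\, P_{N,n_0}(t) $$
$$ \phantom{E[X(t) \mid X(0) = N]} = \sum_{n_0=0}^{N} n_0 \bigl[ G_{N,n_0}(t) - G_{N,n_0-1}(t) \bigr] $$
$$ \phantom{E[X(t) \mid X(0) = N]} = \sum_{n_0=0}^{N-1} \bigl[ 1 - G_{N,n_0}(t) \bigr] . $$

Now, using the expression in (3.17), we get

$$ E[X(t) \mid X(0) = N] = N e^{-p\lambda t} . \tag{3.28} $$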
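The chain of equalities leading to (3.28) can be checked numerically. The sketch below is illustrative: the function names are hypothetical, the first passage cdf is computed from the alternating sum in (3.17) with the coefficient read as N!/(n0! j! (N-n0-j)!) * (-1)**(j-1) * j/(n0+j), and the boundary conventions G_{N,N}(t) = 1 and G_{N,-1}(t) = 0 are those stated with (3.27).

```python
import math

def g_cdf(n, n0, p, lam, t):
    """First passage cdf G_{N,n0}(t), with the boundary conventions of (3.27)."""
    if n0 >= n:
        return 1.0
    if n0 < 0:
        return 0.0
    total = 0.0
    for j in range(1, n - n0 + 1):
        multinomial = math.factorial(n) // (
            math.factorial(n0) * math.factorial(j) * math.factorial(n - n0 - j)
        )
        b = multinomial * (-1) ** (j - 1) * j / (n0 + j)
        total += b * (1.0 - math.exp(-(n0 + j) * p * lam * t))
    return total

def occupancy(n, n0, p, lam, t):
    """State occupancy probability P_{N,n0}(t) = G_{N,n0}(t) - G_{N,n0-1}(t)."""
    return g_cdf(n, n0, p, lam, t) - g_cdf(n, n0 - 1, p, lam, t)

def expected_remaining(n, p, lam, t):
    """E[X(t) | X(0) = N] as the occupancy-weighted sum over n0."""
    return sum(n0 * occupancy(n, n0, p, lam, t) for n0 in range(n + 1))

t = 50.0
print(expected_remaining(10, 0.9, 0.02, t), 10 * math.exp(-0.9 * 0.02 * t))
```

The two printed numbers should agree to within floating-point error, which confirms that the weighted sum collapses to the simple exponential decay for small N. For large N the alternating sum becomes numerically fragile, so the closed form is preferable there.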

Figure 3.5 shows the expected number of remaining errors at
time t for various p, where N = 10 and $\lambda$ = 0.02. As can be seen,
software errors are eliminated faster for larger values of p.
In other words, a good debugger can eliminate software
errors fast. For example, for $n_0$ = 1 a debugger with p = 1 requires
debugging time t = 118, and the debugger with p = 0.8 requires
t = 148. The difference between the two debuggers is 30 in the sense
of expectation.

Figure 3.4 Probability Distributions of Number of Remaining Errors

Figure 3.5 Expected Number of Remaining Errors versus Time t
3.4 Expected Number of Errors Detected by Time t

We introduce a new random variable N(t) which denotes the
total number of errors detected by time t. The process $\{N(t),\ t \ge 0\}$
is called a counting process. We are interested in obtaining the
expression for the expected number of errors detected during the
debugging period t, when the initial number of errors is N, i.e.,

$$ M_N(t) = E[N(t) \mid X(0) = N] , \tag{3.29} $$

which is called a Markov renewal function. By conditioning on the
next software failure, we obtain the renewal equations

$$ M_i(t) = F_i(t) + p\, M_{i-1} * F_i(t) + q\, M_i * F_i(t) , \quad i = 1, 2, \ldots, N , \tag{3.30} $$

where $M_0(t) = 0$.

Using the L-S transforms of $M_i(t)$, i = 1, 2, ..., N, we get

$$ \tilde{M}_N(s) = \frac{1}{p} \sum_{k=1}^{N} \prod_{j=k}^{N} \frac{jp\lambda}{s + jp\lambda} = \frac{1}{p} \sum_{k=1}^{N} \tilde{G}_{N,k-1}(s) . \tag{3.31} $$

The expression for $M_N(t)$ in terms of the first passage time
distribution is then given by

$$ M_N(t) = \frac{1}{p} \sum_{k=1}^{N} G_{N,k-1}(t) . \tag{3.32} $$

Note that if we let $t \to \infty$ we have

$$ M_N(\infty) = \frac{N}{p} , \tag{3.33} $$

which is the expected number of software errors detected by the end
of debugging.

Figure 3.6 shows the expected number of errors detected by time t,
$M_N(t)$, for various N when p = 0.9 and $\lambda$ = 0.02.
Let us now consider the case when the detected errors are
separated as new errors and errors which were not corrected due
to imperfect debugging. Let $N_I(t)$ be a random variable which
denotes the total number of imperfect debugging errors by time t.
Then we can show that

$$ D_N(t) = q\, M_N(t) , \tag{3.34} $$

where

$$ D_N(t) = E[N_I(t) \mid X(0) = N] . $$

Note that $D_N(\infty) = qN/p$.

Plots of $M_N(t)$ and $D_N(t)$ for the case when N = 10, p = 0.9
and $\lambda$ = 0.02 are shown in Figure 3.7.
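The expected numbers of detected and imperfectly debugged errors can be evaluated and checked by simulation. The sketch below rests on a derived convenience that is not stated in the report: combining (3.32) with the identity behind (3.28), namely that the sum of 1 - G_{N,n0}(t) over n0 = 0, ..., N-1 equals N*exp(-p*lam*t), gives M_N(t) = (N/p) * (1 - exp(-p*lam*t)). All function names, the seed, and the run count are hypothetical.

```python
import math
import random

def expected_detected(n, p, lam, t):
    """M_N(t) in closed form, derived from (3.32) and (3.28) as noted above."""
    return (n / p) * (1.0 - math.exp(-p * lam * t))

def expected_imperfect(n, p, lam, t):
    """D_N(t) = q * M_N(t), as in (3.34)."""
    return (1.0 - p) * expected_detected(n, p, lam, t)

def simulate_detected(n, p, lam, t_end, rng):
    """Count failures observed by t_end in one simulated debugging run."""
    remaining, clock, failures = n, 0.0, 0
    while remaining > 0:
        clock += rng.expovariate(remaining * lam)
        if clock > t_end:
            break
        failures += 1                 # every failure is a detection
        if rng.random() < p:          # the error is removed with probability p
            remaining -= 1
    return failures

rng = random.Random(2024)             # fixed seed, hypothetical choice
runs = 20000
estimate = sum(simulate_detected(10, 0.9, 0.02, 50.0, rng) for _ in range(runs)) / runs
print(round(estimate, 2), round(expected_detected(10, 0.9, 0.02, 50.0), 2))
```

As t grows, the closed form tends to N/p, in agreement with (3.33), and the imperfect-debugging count tends to qN/p.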
In the previous section we studied the stochastic behavior of
the number of errors in the software system during the debugging
period. In this section we investigate the distribution of the
time between software failures and study the problem of reliability
growth. From Section 2 recall that the random variable $T_i$ denotes the
time to next failure when the number of remaining errors is i and
$F_i(t)$ is the cdf of $T_i$. Let $X_k$ denote the time between the
(k-1)st and kth software failures and $\Phi_k(x)$ be the cdf of $X_k$.
Note that $X_k$ does depend on the number of remaining errors at the
(k-1)st failure but this number is not explicitly known. Further,
let $\eta_k$, a r.v., denote the number of remaining errors between the
(k-1)st and kth software failures. Then, from Section 2 we have

$$ \eta_1 = N , \tag{4.1} $$

$$ \Phi_1(x) = F_N(x) , \tag{4.2} $$

and

$$ \Phi_2(x) = p\, F_{N-1}(x) + q\, F_N(x) . \tag{4.3} $$

In general, we have

$$ \Phi_k(x) = P\{X_k \le x\} = \sum_{i=N-(k-1)}^{N} P\{X_k \le x \mid \eta_k = i\}\, P\{\eta_k = i\} \tag{4.4} $$

or

$$ \Phi_k(x) = \sum_{j=0}^{k-1} P\{X_k \le x \mid \eta_k = N-k+j+1\}\, P\{\eta_k = N-k+j+1\} $$
$$ \phantom{\Phi_k(x)} = \sum_{j=0}^{k-1} \binom{k-1}{j} p^{k-j-1} q^{j}\, F_{N-(k-j-1)}(x) . \tag{4.5} $$


This is called a mixture of exponential distributions with binomial
mixing proportions. As proved in Barlow and Proschan [1],
$\Phi_k(x)$ is a decreasing failure rate (DFR) distribution. The
reliability function at the kth stage, i.e., between the (k-1)st and kth
failure, is given by

$$ R_k(x) = P\{X_k > x\} = 1 - \Phi_k(x) = \sum_{j=0}^{k-1} \binom{k-1}{j} p^{k-j-1} q^{j}\, \bar{F}_{N-(k-j-1)}(x) , \tag{4.6} $$

where

$$ \bar{F}_N(x) = 1 - F_N(x) = e^{-N\lambda x} . \tag{4.7} $$

Also the corresponding failure rate is given by

$$ r_k(x) = \phi_k(x) / R_k(x) , \tag{4.8} $$

where $\phi_k(x)$ is the p.d.f. of $X_k$. The behavior of $R_k(x)$ with
respect to k is of interest. To study this behavior we have the
following theorem.

Theorem: The reliability function $R_k(x)$ is increasing in k for
any time x > 0, i.e.,

$$ R_{k+1}(x) > R_k(x) . \tag{4.9} $$
Proof: It suffices to show that

$$ \nabla R_k(x) = R_{k+1}(x) - R_k(x) \tag{4.10} $$

is positive for x > 0. We have

$$ \nabla R_k(x) = \sum_{j=0}^{k-1} \binom{k-1}{j} p^{k-j} q^{j} \bigl( \bar{F}_{N-(k-j)}(x) - \bar{F}_{N-(k-j-1)}(x) \bigr) . \tag{4.11} $$

It holds that for x > 0, j = 0, 1, 2, ...,

$$ \bar{F}_{N-(k-j)}(x) > \bar{F}_{N-(k-j-1)}(x) . \tag{4.12} $$

Hence we get

$$ \nabla R_k(x) > 0 \quad \text{for } x > 0 . \tag{4.13} $$

Q.E.D.

The reliability growth curves are shown in Figure 4.1, where
N = 10, p = 0.9 and $\lambda$ = 0.02. The p.d.f.'s and the failure rates
of $X_k$ are shown in Figures 4.2 and 4.3, respectively.

Note that the number of software errors remaining at the time
between the (k-1)st and kth software failures is N-(k-I-1), where the
random variable I is distributed as a binomial with (k-1, q).
Therefore, the expected number of software errors remaining is given
by N-p(k-1). This observation will be useful in constructing a
likelihood function to estimate unknown parameters.
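The reliability function of the kth stage is a binomial mixture of exponential survival functions and is simple to evaluate. The sketch below is illustrative: the function names are hypothetical, the mixture weights are binomial with parameters (k-1, q) as stated above, and k is restricted to k <= N so that at least one error remains in every mixture component.

```python
import math

def reliability(k, x, n, p, lam):
    """R_k(x): probability that the k-th inter-failure time exceeds x."""
    if not 1 <= k <= n:
        raise ValueError("this sketch assumes 1 <= k <= N")
    q = 1.0 - p
    total = 0.0
    for j in range(k):                      # j imperfect debuggings so far
        errors_left = n - (k - j - 1)       # remaining errors in this component
        weight = math.comb(k - 1, j) * p ** (k - j - 1) * q ** j
        total += weight * math.exp(-errors_left * lam * x)
    return total

def expected_errors_at_stage(k, n, p):
    """Expected errors remaining between failures k-1 and k: N - p*(k-1)."""
    return n - p * (k - 1)

# Reliability growth at x = 5 for N = 10, p = 0.9, rate 0.02.
for k in (1, 3, 5, 8):
    print(k, round(reliability(k, 5.0, 10, 0.9, 0.02), 4))
```

The printed values increase with k, which is the reliability growth asserted by the theorem above.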


Figure 4.1 Reliability Growth Curves (horizontal axis: TIME)
5. GAMMA APPROXIMATION FOR A LARGE-SCALE SOFTWARE SYSTEM


In Section 3 we obtained the quantities of interest, e.g.
state occupancy probability and renewal function, in terms of first
passage time distribution. Once we have computed $G_{N,n_0}(t)$, we can
easily obtain $P_{N,n_0}(t)$ and $M_N(t)$. However, it should be noted
that the computation of $G_{N,n_0}(t)$ is almost impossible for a
large-scale software system because of the difficulty in computing the
coefficient $B_{N,j,n_0}$. Through numerical study we have found that
the computations become very messy and almost impossible for
$N - n_0 > 20$. In this section we study methods for obtaining
approximate solutions for these quantities.

Of prime interest is the approximation of first passage time
distribution by using a Gamma distribution. From a study of the
pdf's of first passage times in Figure 3.2, we feel that these
distributions might be approximated by Gamma distributions. We use
the method of moments to obtain estimates of the parameters of a
Gamma distribution corresponding to $G_{N,n_0}(t)$. In order to do that,
we first discuss how to obtain the moments of $G_{N,n_0}(t)$ without
computing the coefficient $B_{N,j,n_0}$. Let $T_{N,n_0}$ be a random variable
which denotes the first passage time from N to $n_0$. The random
variable of holding time at state N, denoted by $T_N$, has an exponential
distribution with parameter $N\lambda$. Therefore, we have

$$ \mu_N = E\,T_N = 1/(N\lambda) , \tag{5.1} $$

$$ \mathrm{Var}(T_N) = E(T_N - \mu_N)^2 = 1/(N\lambda)^2 . \tag{5.2} $$


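The following recursive equations are easily obtained:

$$ \begin{aligned} T_{N,n_0} &= T_N + p\,T_{N-1,n_0} + q\,T_{N,n_0} \\ T_{N-1,n_0} &= T_{N-1} + p\,T_{N-2,n_0} + q\,T_{N-1,n_0} \\ &\ \ \vdots \\ T_{n_0+1,n_0} &= T_{n_0+1} + q\,T_{n_0+1,n_0} \end{aligned} \tag{5.3} $$

Solving (5.3) recursively, we get

$$ T_{N,n_0} = \frac{1}{p} \sum_{j=n_0+1}^{N} T_j . \tag{5.4} $$

Then, we have

$$ \mu_{N,n_0} = E\,T_{N,n_0} = \frac{1}{p} \sum_{j=n_0+1}^{N} E\,T_j = \frac{1}{p} \sum_{j=n_0+1}^{N} \frac{1}{j\lambda} , \tag{5.5} $$

$$ \mathrm{Var}(T_{N,n_0}) = \frac{1}{p^2} \sum_{j=n_0+1}^{N} \mathrm{Var}(T_j) = \frac{1}{p^2} \sum_{j=n_0+1}^{N} \frac{1}{(j\lambda)^2} . \tag{5.6} $$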


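These are identical to the ones obtained in Section 3.2. Suppose
the Gamma distribution corresponding to $G_{N,n_0}(t)$ has a shape
parameter $\alpha$ and a scale parameter $\beta$, so the mean and variance are
given by $\alpha/\beta$ and $\alpha/\beta^2$, respectively. Then the parameters $\alpha$ and
$\beta$ can be estimated by using the method of moments, i.e.,

$$ \frac{1}{p} \sum_{j=n_0+1}^{N} \frac{1}{j\lambda} = \frac{\hat\alpha}{\hat\beta} , \tag{5.7} $$

$$ \frac{1}{p^2} \sum_{j=n_0+1}^{N} \frac{1}{(j\lambda)^2} = \frac{\hat\alpha}{\hat\beta^2} . \tag{5.8} $$

Therefore, we have

$$ \hat\beta = p\, \frac{\sum_{j=n_0+1}^{N} 1/(j\lambda)}{\sum_{j=n_0+1}^{N} 1/(j\lambda)^2} , \tag{5.9} $$

$$ \hat\alpha = \Bigl[ \sum_{j=n_0+1}^{N} 1/(j\lambda) \Bigr]^2 \Big/ \sum_{j=n_0+1}^{N} 1/(j\lambda)^2 . \tag{5.10} $$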
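The method-of-moments fit needs only two sums, so it stays cheap even for large N. The following sketch is a minimal illustration with a hypothetical function name; it computes the mean and variance of the first passage time as (1/p) times the sum of 1/(j*lam) and (1/p**2) times the sum of 1/(j*lam)**2 over j = n0+1, ..., N, and then sets beta to mean/variance and alpha to mean**2/variance.

```python
def gamma_moment_fit(n, n0, p, lam):
    """Return (mean, variance, alpha_hat, beta_hat) for the first passage time."""
    s1 = sum(1.0 / (j * lam) for j in range(n0 + 1, n + 1))
    s2 = sum(1.0 / (j * lam) ** 2 for j in range(n0 + 1, n + 1))
    mean = s1 / p                 # first moment
    variance = s2 / p ** 2        # second central moment
    beta_hat = mean / variance    # equals p * s1 / s2
    alpha_hat = mean * beta_hat   # equals s1**2 / s2, independent of p
    return mean, variance, alpha_hat, beta_hat

# Example with N = 100, p = 0.9, rate 0.02 and n0 = 10.
mean, variance, alpha_hat, beta_hat = gamma_moment_fit(100, 10, 0.9, 0.02)
print(round(mean, 2), round(variance, 2), round(alpha_hat, 2), round(beta_hat, 3))
```

Note that the shape estimate does not depend on p: imperfect debugging rescales time by 1/p but leaves the shape of the fitted distribution unchanged.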


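Numerical examples for various $n_0$ are given in Table 5.1,
where N = 100, p = 0.9 and $\lambda$ = 0.02. We also compute the relative
losses for third and fourth moments around the mean to see how good
the approximations are. Since the third and fourth moments around
the mean of a Gamma distribution with parameters $\alpha$ and $\beta$ are given
by $2\alpha/\beta^3$ and $3\alpha(\alpha+2)/\beta^4$, respectively, we define the relative
losses for third and fourth moments around the mean as

$$ \frac{\bigl| E(T_{N,n_0} - \mu_{N,n_0})^3 - 2\hat\alpha/\hat\beta^3 \bigr|}{E(T_{N,n_0} - \mu_{N,n_0})^3} \tag{5.11} $$

$$ \frac{\bigl| E(T_{N,n_0} - \mu_{N,n_0})^4 - 3\hat\alpha(\hat\alpha+2)/\hat\beta^4 \bigr|}{E(T_{N,n_0} - \mu_{N,n_0})^4} \tag{5.12} $$

respectively, where

$$ E(T_{N,n_0} - \mu_{N,n_0})^3 = \frac{1}{p^3} \sum_{j=n_0+1}^{N} E(T_j - \mu_j)^3 = \frac{2}{p^3} \sum_{j=n_0+1}^{N} \frac{1}{(j\lambda)^3} . \tag{5.13} $$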


Table 5.1

Gamma Approximation for First Passage Time Distributions
(N = 100, p = 0.9, $\lambda$ = 0.02)

                                                   Relative Loss (%)
 n_0    Mean     Variance   alpha-hat  beta-hat   3rd moment    4th moment
 10    125.47    263.01      59.85      0.477     (illegible)     4.21
 15    103.84    168.34      64.05      0.617     (illegible)     2.70
 20     88.31    119.82      65.09      0.737     (illegible)     1.92
 25     76.19     90.31      64.28      0.844       13.19         1.44
 30     66.24     70.47      62.27      0.940       10.37         1.12
 35     57.81     56.23      59.44      1.028        8.14         0.89
 40     50.49     45.49      56.04      1.110        6.35         0.72
 45     44.02     37.12      52.21      1.186        4.92         0.58
 50     38.23     30.40      48.07      1.257        3.77         0.47
 55     32.99     24.90      43.70      1.324        2.84         0.39
 60     28.19     20.30      39.15      1.389        2.09         0.31
 70     19.70     13.07      29.69      1.507        1.03         0.20
 80     12.33      7.63      19.92      1.616        0.41         0.11
 90      5.82      3.39       9.99      1.716        0.09         0.05
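$$ E(T_{N,n_0} - \mu_{N,n_0})^4 = \frac{1}{p^4} \Bigl[ \sum_{j=n_0+1}^{N} E(T_j - \mu_j)^4 + 6 \sum_{i=n_0+1}^{N} \sum_{j>i} E(T_i - \mu_i)^2\, E(T_j - \mu_j)^2 \Bigr] $$
$$ \phantom{E(T_{N,n_0} - \mu_{N,n_0})^4} = \frac{1}{p^4} \Bigl[ 9 \sum_{j=n_0+1}^{N} \frac{1}{(j\lambda)^4} + 6 \sum_{i=n_0+1}^{N} \sum_{j>i} \frac{1}{(i\lambda)^2} \frac{1}{(j\lambda)^2} \Bigr] . \tag{5.14} $$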
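The relative losses for the third and fourth central moments can be computed from power sums alone. This sketch is illustrative and its function name is hypothetical. It assumes the exact central moments are (2/p**3) times the sum of 1/(j*lam)**3 and (1/p**4) times [9 * sum of 1/(j*lam)**4 + 6 * sum over pairs i < j of 1/(i*lam)**2 * 1/(j*lam)**2], and that the fitted Gamma distribution has central moments 2*alpha/beta**3 and 3*alpha*(alpha+2)/beta**4. The pairwise double sum is evaluated through the identity 2 * (sum over i < j of a_i*a_j) = (sum of a_i)**2 - (sum of a_i**2).

```python
def relative_losses(n, n0, p, lam):
    """Return (third-moment loss, fourth-moment loss) in percent."""
    rates = [j * lam for j in range(n0 + 1, n + 1)]
    s1 = sum(1.0 / r for r in rates)
    s2 = sum(1.0 / r ** 2 for r in rates)
    s3 = sum(1.0 / r ** 3 for r in rates)
    s4 = sum(1.0 / r ** 4 for r in rates)

    # Method-of-moments Gamma parameters from the mean and variance.
    mean, variance = s1 / p, s2 / p ** 2
    beta_hat = mean / variance
    alpha_hat = mean * beta_hat

    # Exact central moments of the first passage time.
    third_exact = 2.0 * s3 / p ** 3
    pair_sum = (s2 ** 2 - s4) / 2.0          # sum over i < j of products
    fourth_exact = (9.0 * s4 + 6.0 * pair_sum) / p ** 4

    # Central moments of the fitted Gamma distribution.
    third_gamma = 2.0 * alpha_hat / beta_hat ** 3
    fourth_gamma = 3.0 * alpha_hat * (alpha_hat + 2.0) / beta_hat ** 4

    loss3 = 100.0 * abs(third_exact - third_gamma) / third_exact
    loss4 = 100.0 * abs(fourth_exact - fourth_gamma) / fourth_exact
    return loss3, loss4

loss3, loss4 = relative_losses(100, 25, 0.9, 0.02)
print(round(loss3, 2), round(loss4, 2))
```

Because the first two moments are matched exactly, the losses measure only how well the Gamma shape captures skewness and kurtosis, and they shrink as more exponential stages are summed.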


Figure 5.1 shows the relative losses for third and fourth
moments around the mean as a function of N, where p = 0.9,
$\lambda$ = 0.02 and $n_0 = 0.2N$. As we see in this figure, the maximum
relative losses for third and fourth moments around the mean are
about 17% and 10%, respectively. This means that the Gamma
approximation of first passage time distributions for large-scale
software systems is reasonably good.

Plots of first passage time distributions using Gamma approximation
for N = 100, p = 0.9 and $\lambda$ = 0.02 are given in Figure 5.2 for
$n_0$ = 0, 1, 2, 3, 5, and 9. Also, plots of state occupancy
probabilities using this approximation are given in Figure 5.3.
Figure 5.1 Relative Loss for the Third and Fourth Moments

Figure 5.3 State Occupancy Probabilities Using Gamma Approximations


6. CONCLUDING REMARKS


An imperfect debugging model (IDM) for software systems
was developed in this report. Various quantities of interest were
derived in terms of the first passage time distribution of the
underlying semi-Markov process. Computations for and usefulness
of these quantities were illustrated via numerical examples.
An approximation method for obtaining these quantities for
large-scale software systems was also presented.

It should be pointed out that most of the models reported
in the literature, for example the models in [3], [6], [9], [10],
and [13], are special cases of IDM.